Your browser doesn't support javascript.
Show: 20 | 50 | 100
Results 1 - 1 de 1
Filter
Add filters

Database
Language
Document Type
Year range
1.
Infect Dis Poverty ; 11(1): 50, 2022 May 04.
Article in English | MEDLINE | ID: covidwho-1883543

ABSTRACT

BACKGROUND: Influenza B virus can cause epidemics with high pathogenicity, so it poses a serious threat to public health. A feature representation algorithm is proposed in this paper to identify the pathogenicity phenotype of influenza B virus. METHODS: The dataset included all 11 influenza virus proteins encoded in eight genome segments of 1724 strains. Two types of features were hierarchically used to build the prediction model. Amino acid features were directly delivered from 67 feature descriptors and input into the random forest classifier to output informative features about the class label and probabilistic prediction. The sequential forward search strategy was used to optimize the informative features. The final features for each strain had low dimensions and included knowledge from different perspectives, which were used to build the machine learning model for pathogenicity identification. RESULTS: The 40 signature positions were achieved by entropy screening. Mutations at position 135 of the hemagglutinin protein had the highest entropy value (1.06). After the informative features were directly generated from the 67 random forest models, the dimensions for class and probabilistic features were optimized as 4 and 3, respectively. The optimal class features had a maximum accuracy of 94.2% and a maximum Matthews correlation coefficient of 88.4%, while the optimal probabilistic features had a maximum accuracy of 94.1% and a maximum Matthews correlation coefficient of 88.2%. The optimized features outperformed the original informative features and amino acid features from individual descriptors. The sequential forward search strategy had better performance than the classical ensemble method. CONCLUSIONS: The optimized informative features had the best performance and were used to build a predictive model so as to identify the phenotype of influenza B virus with high pathogenicity and provide early risk warning for disease control.


Subject(s)
Amino Acids , Influenza B virus , Algorithms , Amino Acids/genetics , Influenza B virus/genetics , Machine Learning , Virulence
SELECTION OF CITATIONS
SEARCH DETAIL